MixMatch: A Holistic Approach to Semi-Supervised Learning

Neural Information Processing Systems

Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that guesses low-entropy labels for data-augmented unlabeled examples and mixes labeled and unlabeled data using MixUp. MixMatch obtains state-of-the-art results by a large margin across many datasets and labeled data amounts. For example, on CIFAR-10 with 250 labels, we reduce error rate by a factor of 4 (from 38% to 11%) and by a factor of 2 on STL-10. We also demonstrate how MixMatch can help achieve a dramatically better accuracy-privacy trade-off for differential privacy.
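The label-guessing and mixing steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: `model` stands for any classifier returning class probabilities, and the temperature `T` and the `max(lam, 1 - lam)` trick follow the paper's description of temperature sharpening and its asymmetric use of MixUp.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Reduce the entropy of a guessed label distribution (temperature sharpening)."""
    p = p ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)

def guess_labels(model, augmented_views, T=0.5):
    """Average the model's predictions over K augmentations of one unlabeled
    example, then sharpen the average into a low-entropy guessed label."""
    mean_pred = np.mean([model(v) for v in augmented_views], axis=0)
    return sharpen(mean_pred, T)

def mixup(x1, y1, x2, y2, alpha=0.75):
    """MixUp with MixMatch's asymmetry: taking lam >= 0.5 keeps the mixed
    example closer to its first argument."""
    lam = np.random.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2
```

A training step would guess labels for each unlabeled batch, then apply `mixup` across the shuffled union of labeled and unlabeled examples.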


Sparse maximal update parameterization: A holistic approach to sparse training dynamics

Neural Information Processing Systems

Several challenges make it difficult for sparse neural networks to compete with dense models. First, setting a large fraction of weights to zero impairs forward and gradient signal propagation. Second, sparse studies often need to test multiple sparsity levels, while also introducing new hyperparameters (HPs), leading to prohibitive tuning costs. Indeed, the standard practice is to re-use the learning HPs originally crafted for dense models. Unfortunately, we show sparse and dense networks do not share the same optimal HPs. Without stable dynamics and effective training recipes, it is costly to test sparsity at scale, which is key to surpassing dense networks and making the business case for sparsity acceleration in hardware. A holistic approach is needed to tackle these challenges, and we propose SμPar as one such approach.
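A toy illustration (our own, not SμPar's actual parameterization) of why dense-tuned HPs misfit sparse layers: when only a `density` fraction of weights is nonzero, the layer's effective fan-in shrinks, so a variance-preserving initialization scale, and a muP-style per-layer learning rate (an assumption here), both shift with sparsity instead of staying at their dense values.

```python
import numpy as np

def sparse_layer_scales(fan_in, density, base_lr=1e-3):
    """Illustrative only: with a fraction `density` of weights active, the
    effective fan-in is fan_in * density. A variance-preserving init std
    grows as density falls, and a muP-style per-layer learning rate
    (assumed form) grows with it; reusing dense values mis-scales both."""
    eff_fan_in = fan_in * density
    init_std = 1.0 / np.sqrt(eff_fan_in)     # keep pre-activation variance ~1
    lr = base_lr * (fan_in / eff_fan_in)     # compensate smaller effective width
    return init_std, lr

dense_std, dense_lr = sparse_layer_scales(1024, density=1.0)
sparse_std, sparse_lr = sparse_layer_scales(1024, density=0.1)
```

At 90% sparsity the sketch asks for a roughly 3x larger init std and 10x larger learning rate than the dense recipe, which is the kind of gap that makes dense HP reuse costly.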


Review for NeurIPS paper: HRN: A Holistic Approach to One Class Learning

Neural Information Processing Systems

Weaknesses: Some aspects that need to be improved are as follows. It is not precise to state that methods based on auto-encoders, GANs, or self-supervised classification are one-class learning approaches. The objective functions in these methods are very different from the one-class learning objective. I would suggest rephrasing statements like this one to avoid misleading readers. I believe the authors should provide these results in the supplementary materials and add discussion to help readers fully understand the key difference and the insights behind the H-regularization.


Review for NeurIPS paper: HRN: A Holistic Approach to One Class Learning

Neural Information Processing Systems

This paper proposes a novel deep one-class classification method in which a regularization technique is specially designed for the one-class classification problem. It also provides insights into the bottlenecks of previous methods for this problem; one insight is quite novel and has not been considered before (representation learning from one-class data is biased toward the given training data), since previous methods mainly focused on the other (deep network outputs become over-confident given one-class data). I feel the paper may further inspire more cleverly designed methods for this problem! While at the beginning the reviewers had some concerns (mainly about clarity and about the generality that relates to significance), the authors did a particularly good job in their rebuttal (showing that the proposal is not limited to a single surrogate loss function). Thus, in the end, all of us agreed to accept this paper for publication! Please carefully address the concerns from all 3 reviewers in the next version.


Reviews: MixMatch: A Holistic Approach to Semi-Supervised Learning

Neural Information Processing Systems

Originality: 7; Quality: 8; Clarity: 4; Significance: 7. MixMatch combines many classical methods used for semi-supervised learning and achieves state-of-the-art results by a large margin across many datasets and labeled-data amounts. Compared to previous work, this method is not merely a simple combination of different data-augmentation methods and other techniques, such as the exponential moving average (EMA); it also explores a path to fully combine the advantages of different methods. In short, this method is certainly a big step for semi-supervised learning on image classification. However, the experiments in this paper still need to be revised to allow a fair comparison with previous work, such as Mean Teacher. Also, some small problems need to be fixed before final publication.



A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

Neural Information Processing Systems

In recent years, concept-based approaches have emerged as some of the most promising explainability methods to help us interpret the decisions of Artificial Neural Networks (ANNs). These methods seek to discover intelligible "visual concepts" buried within the complex patterns of ANN activations in two key steps: (1) concept extraction, followed by (2) importance estimation. While these two steps are shared across methods, they all differ in their specific implementations. Here, we introduce a unifying theoretical framework that recasts the first step -- the concept extraction problem -- as a special case of dictionary learning, and formalizes the second step -- concept importance estimation -- as a more general form of attribution method. This framework offers several advantages, as it allows us: (i) to propose new evaluation metrics for comparing different concept extraction approaches; (ii) to leverage modern attribution methods and evaluation metrics to extend and systematically evaluate state-of-the-art concept-based approaches and importance estimation techniques; and (iii) to derive theoretical guarantees regarding the optimality of such methods. We further leverage our framework to tackle a crucial question in explainability: how to efficiently identify clusters of data points that are classified based on a similar shared strategy. To illustrate these findings and to highlight the main strategies of a model, we introduce a visual representation called the strategic cluster graph.
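The two-step recipe this framework unifies can be illustrated on toy activations. The sketch below is our own illustration, not the paper's method: concept extraction is done with classic Lee-Seung multiplicative NMF updates (one instance of dictionary learning), and importance is scored with a hypothetical occlusion-style estimator that zeroes one concept and measures the reconstruction-error increase.

```python
import numpy as np

def nmf(A, k, iters=300, seed=0):
    """Dictionary learning via NMF (Lee-Seung multiplicative updates):
    A (samples x features) ~ U @ W, with rows of W acting as 'concepts'
    and rows of U as per-sample concept coefficients."""
    rng = np.random.default_rng(seed)
    U = rng.random((A.shape[0], k))
    W = rng.random((k, A.shape[1]))
    for _ in range(iters):
        U *= (A @ W.T) / (U @ W @ W.T + 1e-9)
        W *= (U.T @ A) / (U.T @ U @ W + 1e-9)
    return U, W

def concept_importance(A, U, W, j):
    """Occlusion-style importance (an assumption, not the paper's estimator):
    how much reconstruction error grows when concept j is zeroed out."""
    U_abl = U.copy()
    U_abl[:, j] = 0.0
    return np.linalg.norm(A - U_abl @ W) - np.linalg.norm(A - U @ W)

# Toy non-negative "activations", as if taken after a ReLU layer.
rng = np.random.default_rng(1)
A = rng.random((100, 20))
U, W = nmf(A, k=5)
scores = [concept_importance(A, U, W, j) for j in range(5)]
```

Swapping NMF for PCA or K-Means, or the occlusion score for a gradient-based attribution, yields other members of the family the paper's framework covers.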


HRN: A Holistic Approach to One Class Learning

Neural Information Processing Systems

Existing neural-network-based one-class learning methods mainly use various forms of auto-encoders or GAN-style adversarial training to learn a latent representation of the given one class of data. This paper proposes an entirely different approach based on a novel regularization, called holistic regularization (or H-regularization), which enables the system to consider the data holistically rather than producing a model biased toward some features. Combined with a proposed 2-norm instance-level data normalization, we obtain an effective one-class learning method, called HRN. To our knowledge, the proposed regularization and normalization methods have not been reported before. Experimental evaluation using both benchmark image-classification and traditional anomaly detection datasets shows that HRN markedly outperforms existing state-of-the-art deep and non-deep learning models.
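A minimal sketch of the two ingredients named in the abstract, under our own reading of them: the 2-norm instance-level normalization rescales each example to unit L2 norm, and, specialized to a purely linear scorer, an input-gradient penalty (one plausible reading of H-regularization; the paper's actual form for deep networks may differ) reduces to a power of the weight norm.

```python
import numpy as np

def l2_instance_normalize(X, eps=1e-12):
    """2-norm instance-level normalization: scale each example (row) to unit
    L2 norm so raw feature magnitudes cannot dominate the one-class score."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

def holistic_penalty_linear(w, n=1):
    """Hypothetical reading of H-regularization for a *linear* scorer
    f(x) = w @ x + b: since grad_x f = w, penalizing ||grad_x f||^(2n)
    equals ||w||^(2n), discouraging reliance on a few dominant features."""
    return float(np.sum(w ** 2) ** n)
```

The penalty pushes the scorer to spread weight across features, which matches the abstract's goal of a model that does not bias toward some features.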

